feat: Rust recipe runner integration with engine selection#2951
🤖 Auto-fixed version bump The version in If you need a minor or major version bump instead, please update |
Repo Guardian - Passed ✅
All files are durable repository content.
Reviewed 6 changed files:
No ephemeral content detected (no meeting notes, temporary scripts, or point-in-time documents).
1c447a3 to 902cfa8
c5dd179 to c0d3e13
🤖 PR Triage Complete
Risk Level: Medium-High (6.5/10)
📊 Summary
This PR integrates a Rust recipe runner as an alternative execution engine with automatic selection and graceful fallback to Python.
Changes:
🟡 Triage Result: NEEDS CONFLICT RESOLUTION
Priority: MEDIUM-HIGH | Risk: HIGH
Assessment
High-value Rust recipe runner integration with manageable scope, but merge conflicts prevent automated merging.
Stats:
Blockers
❌ Merge conflicts - must be resolved before review
Recommended Action
Why This Matters
Recipe runner integration is foundational infrastructure. Clean merge critical for:
Related Issues
Automated triage by PR Triage Agent - Run #22827330377
Adds the Rust recipe runner binary integration with automatic engine selection and startup dependency management.

- src/amplihack/recipes/rust_runner.py: Binary wrapper with find, ensure, and execute functions. RustRunnerNotFoundError for explicit failures. ensure_rust_recipe_runner() auto-installs via cargo if binary is missing.
- src/amplihack/recipes/__init__.py: Engine selection via RECIPE_RUNNER_ENGINE env var (rust/python/auto-detect). Exports ensure_rust_recipe_runner.
- src/amplihack/install.py: Step 6.5 ensures binary during amplihack install.
- tests/recipes/test_rust_runner.py: 26 tests covering discovery, execution, engine selection, and ensure flow.
- docs/recipes/README.md: Documents engine selection and auto-install.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
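The engine selection described above (rust/python/auto-detect via RECIPE_RUNNER_ENGINE) could look roughly like the following. This is a minimal sketch, not the code in src/amplihack/recipes/__init__.py: the function name `select_engine`, the binary name `recipe-runner-rs`, and the treatment of an unset variable as auto-detect are all assumptions made for illustration.

```python
import os
import shutil

VALID_ENGINES = ("rust", "python")
# Hypothetical binary name, used only for this illustration.
ASSUMED_BINARY_NAME = "recipe-runner-rs"


def select_engine(env=None, which=shutil.which):
    """Pick an engine from RECIPE_RUNNER_ENGINE, or auto-detect when unset."""
    env = os.environ if env is None else env
    requested = env.get("RECIPE_RUNNER_ENGINE", "").strip().lower()
    if requested:
        if requested not in VALID_ENGINES:
            # Unknown values fail loudly instead of silently falling back.
            raise ValueError(
                f"Invalid RECIPE_RUNNER_ENGINE={requested!r}; "
                f"expected one of {VALID_ENGINES}"
            )
        return requested
    # Auto-detect (assumed behavior): prefer Rust if the binary is on PATH.
    return "rust" if which(ASSUMED_BINARY_NAME) else "python"
```

Passing `env` and `which` as parameters keeps the sketch testable without touching the real environment or PATH.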
- Validate RECIPE_RUNNER_ENGINE values (raise ValueError on unknown)
- Add non-interactive footer to NestedSessionAdapter
- Add session depth tracking to NestedSessionAdapter

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- PR-M1: Split run_recipe_via_rust into focused helpers
- PR-M2: Configurable timeouts via env vars
- PR-M3: Remove point-in-time Python references in docs
- PR-M4: Remove hardcoded counts from docs
- PR-M5: Add tests for empty results and exception paths
- PR-L1: Redact context values in log output
- PR-L2: Lazy binary search path evaluation

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…th (C2-PR-1, C2-PR-2, C2-PR-6, C2-PR-9, C2-PR-10)

- C2-PR-1: Raise ValueError on invalid RECIPE_RUNNER_ENGINE values
- C2-PR-2: Log full traceback for ensure_rust_recipe_runner failures
- C2-PR-6: Enforce AMPLIHACK_MAX_DEPTH in execute_agent_step
- C2-PR-9: Add test for invalid engine value validation
- C2-PR-10: Add test for execution timeout propagation

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
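As an illustration of the AMPLIHACK_MAX_DEPTH enforcement mentioned in C2-PR-6, a nested-session guard might be sketched as below. Only the AMPLIHACK_MAX_DEPTH variable name comes from the commit message; the depth counter variable `AMPLIHACK_SESSION_DEPTH`, the default limit of 3, the exception type, and the helper name are hypothetical.

```python
DEFAULT_MAX_DEPTH = 3  # assumed default, not taken from the PR
# Hypothetical counter variable used only in this sketch.
DEPTH_VAR = "AMPLIHACK_SESSION_DEPTH"


class MaxDepthExceededError(RuntimeError):
    """Raised when a nested agent step would exceed the allowed depth."""


def child_session_env(env):
    """Return a copy of env for a child session, or raise at the limit."""
    max_depth = int(env.get("AMPLIHACK_MAX_DEPTH", DEFAULT_MAX_DEPTH))
    depth = int(env.get(DEPTH_VAR, "0"))
    if depth >= max_depth:
        raise MaxDepthExceededError(
            f"session depth {depth} reached AMPLIHACK_MAX_DEPTH={max_depth}"
        )
    child = dict(env)
    # The child sees a depth one greater than its parent.
    child[DEPTH_VAR] = str(depth + 1)
    return child
```

Checking the limit before spawning, rather than inside the child, means a runaway recursion fails in the parent with a clear error.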
…g_dir (C2-INT-3/4/5/6/7/10)

- C2-INT-3: Serialize Duration as f64 seconds (Rust repo)
- C2-INT-4/5/6: Document Rust-only features in engine comparison table
- C2-INT-7: Document all environment variables
- C2-INT-10: Resolve working_dir to absolute path to prevent double-application

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
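The double-application problem in C2-INT-10 can be illustrated as follows, assuming it arises when a relative working_dir is applied once by the wrapper and then again by the process it launches. The helper name `resolve_working_dir` is hypothetical; the point is only that an absolute path is unaffected by a second join.

```python
from pathlib import Path


def resolve_working_dir(working_dir, base=None):
    """Resolve working_dir to an absolute path relative to base (default: cwd)."""
    base_path = Path(base) if base is not None else Path.cwd()
    path = Path(working_dir)
    if not path.is_absolute():
        path = base_path / path
    return path.resolve()


relative = Path("proj")
absolute = resolve_working_dir("proj", base="/tmp")

# Applying a relative path twice nests it: /tmp/proj/proj
twice_relative = Path("/tmp") / relative / relative
# Joining onto an absolute path discards the left side, so nothing doubles.
twice_absolute = Path("/tmp/proj") / absolute
```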
…ority test (C3-PR-1/2/3)

- C3-PR-1: Print warning on install exception (was silent)
- C3-PR-2: Return resolved path from find_rust_binary for env var path
- C3-PR-3: Fix false-confidence test with discriminating mock

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
33e5b79 to 16928c8
Repo Guardian - Passed ✅
All files are durable repository content.
Reviewed 8 changed files:
No ephemeral content detected (no meeting notes, temporary scripts, or point-in-time documents).
Replaces Python recipe runner with Rust implementation from rysweet/amplihack-recipe-runner.

- Engine selection via RECIPE_RUNNER_ENGINE env var (rust/python/auto-detect)
- Auto-installs via cargo on first use (ensure_rust_recipe_runner)
- Nested session depth enforcement (AMPLIHACK_MAX_DEPTH)
- Non-interactive footer for autonomous agent execution
- Configurable timeouts via env vars
- Context value redaction in logs
- 34 tests covering all paths

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… on new repos without origin/main. ## Problem default-workflow currently assumes that origin/main already exists when it reaches s (#3620) * fix: agent resolver now handles 3-part refs (namespace:category:name) (#2856) Recipes use 3-part agent references like 'amplihack:core:architect' but the resolver only handled 2-part 'amplihack:architect'. The split on the first colon left 'core:architect' as the name, which failed the safety regex. All recipe steps using amplihack:core:* or amplihack:specialized:* silently lost their agent system prompts. Now parses 2-part and 3-part refs correctly, validates each segment independently, and rejects 4+ parts. Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: add continue-on-error to agentic workflow discussion steps (#2853) * fix: add continue-on-error to agentic workflow discussion creation steps Four agentic workflows fail when GitHub Discussions categories aren't available: daily-code-metrics, weekly-issue-summary, issue-classifier, and repo-guardian. Adding continue-on-error: true to the "Process Safe Outputs" step lets them degrade gracefully instead of failing the run. Closes #2749, closes #2790, closes #2745, closes #2756 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * [skip ci] chore: Auto-bump patch version --------- Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * fix: add pre-flight check for claude CLI binary in AutoMode (#2854) * fix: add pre-flight check for claude CLI binary in AutoMode When the claude binary is missing, the Claude Agent SDK hangs or fails silently. This adds: 1. Pre-flight shutil.which("claude") check before SDK initialization 2. 
Actionable error messages pointing to installation instructions when binary/process errors are detected at runtime Closes #2769 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * [skip ci] chore: Auto-bump patch version --------- Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * fix(drift): only fail on CHANGED files, treat MISSING/EXTRA as warnings (#2857) * fix(drift): only fail on CHANGED files, treat MISSING/EXTRA as warnings Resolves the CI failures caused by check_drift.py exiting non-zero for every PR due to 434 MISSING and 83 EXTRA files that represent intentional structural differences between .claude/ and amplifier-bundle/. Changes: - scripts/check_drift.py: MISSING/EXTRA → warnings (exit 0), CHANGED → errors (exit 1) - Sync 11 CHANGED files from .claude/skills/ to amplifier-bundle/skills/ - Sync 28 CHANGED files from .claude/skills/ to docs/claude/skills/ After this change check_drift.py exits 0 on the current repo state. Future content drift (CHANGED) will still cause CI failure. 
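The drift policy above (MISSING/EXTRA as warnings, CHANGED as errors) amounts to a small exit-code rule. The sketch below is not the contents of scripts/check_drift.py; the `(status, path)` tuple shape and the function name are assumptions chosen for illustration.

```python
import sys

WARNING_STATUSES = {"MISSING", "EXTRA"}
ERROR_STATUSES = {"CHANGED"}


def drift_exit_code(findings, out=sys.stderr):
    """Return 1 only if any finding is CHANGED; report the rest as warnings.

    findings is an iterable of (status, path) pairs (assumed shape).
    """
    exit_code = 0
    for status, path in findings:
        if status in ERROR_STATUSES:
            print(f"ERROR   {status}: {path}", file=out)
            exit_code = 1
        elif status in WARNING_STATUSES:
            print(f"WARNING {status}: {path}", file=out)
    return exit_code
```

With this rule, structural differences between the two trees never fail CI, while content drift in a file present on both sides still does.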
Closes #2820 follow-up Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [skip ci] chore: Auto-bump patch version --------- Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * docs: streamline Claude CLI installation in PREREQUISITES.md (#2719) - Consolidate 4 installation methods into a clear platform table - Remove deprecated npm instructions (kept deprecation note) - Remove redundant/conflicting information - Simplify auto-installation section Fixes #2371 Co-authored-by: voidborne-d <voidborne-d@users.noreply.github.com> * docs: add reading guide to README and fix broken anchor (#2858) * docs: add reading guide to README and fix broken #feature-catalog anchor Adds a quick navigation guide after the install command. Also fixes the pre-existing broken ToC link: #feature-catalog -> #features (the actual heading is "## Features", not "## Feature Catalog"). Based on PR #2836 with anchor fix applied. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * [skip ci] chore: Auto-bump patch version --------- Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * [docs] Update documentation for merged PRs from March 2-3, 2026 (#2823) * Update documentation for merged PRs from March 2-3, 2026 Updates documentation following Diátaxis framework for 4 merged PRs: ## Recipe Runner Updates ### Recipe Discovery (PR #2813) - Document installed package path support in discovery.py - Update priority order in README and troubleshooting guide - Add v0.9.0 feature callout for pip install support - Explain absolute path resolution via Path(__file__) ### Bash Timeouts (PR #2807) - Add timeout field to step fields table - Document new "Bash Step Timeouts" section - Clarify default behavior: no timeout (None) - Show optional timeout configuration examples ### Adapter Auto-Detection (PR #2804) - Document get_adapter() usage in new reference doc - Explain NestedSessionAdapter selection for CLAUDECODE env - Note heredoc quoting fixes and condition eval improvements ## Skills System Updates ### Skill Frontmatter (PR #2811) - Add "YAML Frontmatter Requirements" section to SKILL_CATALOG.md - Document common mistakes fixed in v0.9.0 - Show correct vs incorrect frontmatter examples - List critical requirements for frontmatter validation ## New Reference Document Created docs/recipes/RECENT_FIXES_MARCH_2026.md consolidating: - All 4 PR fixes with technical details - Root cause analysis for each issue - Solutions and impact statements - Test verification and documentation references Follows Diátaxis framework: - Reference: Technical specifications for discovery, timeouts, frontmatter - How-to: Troubleshooting guides for common issues - Explanation: Why changes were needed and their impact 
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * docs: add missing CWD-relative paths to recipe discovery priority list The recipe discovery docs listed 4 search paths but the code has 6. Added the 2 CWD-relative legacy paths (amplifier-bundle/recipes/ and src/amplihack/amplifier-bundle/recipes/) so the docs match the actual discovery.py implementation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: GitHub Actions Bot <noreply@github.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com> Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net> * feat: persist NODE_OPTIONS memory config preference to ~/.amplihack/config (#2860) * feat: persist NODE_OPTIONS memory config preference to ~/.amplihack/config First run prompts the user for NODE_OPTIONS consent and saves the answer to ~/.amplihack/config (JSON). Subsequent runs skip the prompt and emit an informational message showing the saved setting and the config file path so users know how to change it. 
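The first-run versus returning-user behavior described above can be sketched as a JSON file that is read before prompting and written after. The commit names get_config_path, load_user_preference, and save_user_preference, but their real signatures are not shown here; the signatures below and the key name `node_options_consent` are assumptions.

```python
import json
from pathlib import Path

ASSUMED_KEY = "node_options_consent"  # hypothetical key name


def get_config_path(home=None):
    """Location of the JSON config file: <home>/.amplihack/config."""
    return (Path(home) if home is not None else Path.home()) / ".amplihack" / "config"


def load_user_preference(path, key=ASSUMED_KEY):
    """Return the saved preference, or None on first run or unreadable file."""
    try:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return None
    return data.get(key) if isinstance(data, dict) else None


def save_user_preference(path, value, key=ASSUMED_KEY):
    """Persist the preference, keeping any other keys already in the file."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
        if not isinstance(data, dict):
            data = {}
    except (OSError, json.JSONDecodeError):
        data = {}
    data[key] = value
    path.write_text(json.dumps(data, indent=2), encoding="utf-8")
```

A caller would prompt only when `load_user_preference` returns None, then save the answer so later runs skip the prompt.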
Changes: - memory_config.py: add get_config_path, load_user_preference, save_user_preference; update get_memory_config to load saved pref and skip prompt for returning users; update display_memory_config to emit info message (with config path) for returning users - test_memory_config.py: add TestConfigPersistence (8 tests) and TestFirstRunVsReturningUser (7 tests) covering first-run and returning-user code paths; all 67 tests pass Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [skip ci] chore: Auto-bump patch version --------- Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * fix: quality-audit must fix ALL findings per cycle, add structured inputs (#2842, #2843) (#2861) Addresses two issues with the quality-audit recipe: - #2842: Each cycle must fix ALL confirmed findings before moving to the next. Added fix-all-per-cycle enforcement rule, verify-fixes bash step that compares confirmed findings against fix results, and updated recurse-decision to check for NEW findings rather than old unfixed ones. - #2843: Added structured inputs (severity_threshold, module_loc_limit, fix_all_per_cycle, categories) to the recipe context so audits are configurable and reproducible without modifying the recipe file. 
Changes: - quality-audit-cycle.yaml: v3.0.0 → v4.0.0, new context variables, verify-fixes step, strengthened fix step prompt, updated recurse-decision logic - SKILL.md: v3.0 → v4.0, documented fix-all rule, structured inputs table, fix verification step, loop decision based on new findings - 27 outside-in tests covering structured inputs, fix-all enforcement, verify logic, recipe version, and skill documentation Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add RECIPE step type for sub-recipe composition (#2821) (#2862) * feat: add RECIPE step type for sub-recipe composition (issue #2821) Adds the ability to invoke sub-recipes as steps within a recipe YAML file, enabling workflow composition and reuse. Changes: - models.py: Add StepType.RECIPE enum member; add `recipe` and `sub_context` fields to the Step dataclass - parser.py: Parse `type: recipe` steps from YAML (reads `recipe` as sub-recipe name and `context` as dict merged into sub-recipe); add `recipe`/`context` to known step fields; infer StepType.RECIPE when `recipe` field present; validate that recipe steps have a `recipe` field - runner.py: Import find_recipe at module level; add MAX_RECIPE_DEPTH=3 constant; add `_depth` parameter to RecipeRunner; add _execute_sub_recipe() that checks depth guard, merges context, and delegates to a child RecipeRunner; route StepType.RECIPE in _dispatch_step() - tests: 19 new unit tests in test_recipe_step_type.py covering enum, dataclass fields, parser, happy-path execution, context merging, failure propagation, recursion depth guard, and dry-run - docs: Update Step Fields table in docs/recipes/README.md with `recipe`, `context`, and `output` fields; add Recipe Step section with YAML example Closes #2821 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [skip ci] chore: Auto-bump patch version --------- Co-authored-by: Ubuntu 
<azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * docs: fix README/CONTRIBUTING gaps — session explanation, install order, /dev intro, uv sync (#2865) - #2777: Add explanation of what the interactive session does after install - #2778: Clarify that uv sync installs all deps into local .venv - #2780: Add "install prerequisites first" before install options - #2781: Expand first /dev mention with context about what it does #2782 already addressed (cost info exists). #2783, #2784 closed by owner. Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: distributed hive mind — federation, LearningAgent eval, retrieval pipeline (#2717) feat: distributed hive mind — federation, retrieval pipeline, CRDTs, gossip, eval framework * fix: improve ADO skill auto-activation with keyword-rich descriptions (#2868) - azure-devops: description now includes ADO, work items, user stories, bugs, sprints, builds, releases, Azure DevOps URLs - azure-devops-cli: added auto_activate_keywords for az devops, az pipelines, etc. Closes #2850 Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add transcript-viewer skill for JSONL log reading (#2869) * feat: add transcript-viewer skill for JSONL log reading (#2445) Adds a new Claude Code skill that wraps claude-code-log CLI to convert and browse JSONL session transcripts as HTML or Markdown. 
Features: - Current session mode: views the most recently modified JSONL transcript - Specific session mode: looks up a session by ID - Agent output mode: views .agent-step-*.log background task files - All sessions mode: lists and browses all project sessions with date-range filtering - Graceful degradation: clear install instructions when claude-code-log is missing - HTML or Markdown output format Includes 18 tests validating all core behaviors (tool detection, mode routing, date filtering, JSONL parsing, YAML frontmatter structure). Closes #2445 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [skip ci] chore: Auto-bump patch version * feat: add GitHub Copilot CLI support to transcript-viewer skill (#2445) Research shows Copilot CLI has no automatic log persistence (unlike Claude Code's ~/.claude/projects/*.jsonl). Sessions are exported manually via `/share markdown`. Changes: - Version bumped to 1.1.0 - Auto-detects log source: JSONL (Claude Code) vs markdown (Copilot /share export) - Context detection uses launcher_detector.py env vars (CLAUDE_CODE_SESSION, GITHUB_COPILOT_TOKEN, COPILOT_SESSION) — defaults to claude-code as safe fallback - Copilot guidance: when in Copilot context with no file, shows /share instructions - Format detection guards against false-positives (no /share in regex) - GITHUB_TOKEN excluded from Copilot markers (too generic for CI environments) - 6 new tests (tests 12-18): format auto-detection, context detection, markdown parsing, false-positive guard, unknown format, plain-log detection - 34/34 tests pass Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: sync quality-audit/SKILL.md drift between source and bundle/docs copies The drift detection CI was failing because quality-audit/SKILL.md had diverged: - Source (.claude/skills): v4.0 with fix-all-per-cycle rule (#2842) - amplifier-bundle/skills: v3.0 (stale copy) - docs/claude/skills: v3.0 (stale copy) Copied source of truth to both locations to resolve the 
CHANGED drift errors. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(transcript-viewer): support GitHub Copilot CLI JSONL logs instead of markdown exports Copilot CLI auto-saves sessions to ~/.copilot/session-state/*/events.jsonl (JSONL format, not markdown exports). Update the skill to use the correct paths. Changes: - Fix Copilot log path to ~/.copilot/session-state/*/events.jsonl - Add directory-based auto-detection (checks ~/.copilot/session-state/ and ~/.claude/projects/ for sessions before falling back to env vars) - Mode 1: read latest events.jsonl from most recent ~/.copilot/session-state/*/ dir - Mode 2: look up session by directory ID in ~/.copilot/session-state/ - Mode 4: list session dirs in ~/.copilot/session-state/ showing session IDs - Remove incorrect "no auto-save" guidance and /share markdown instructions - Document workspace.yaml, plan.md, and checkpoints/ in session structure - Update tests: replace Copilot markdown tests with Copilot JSONL path tests (20 test groups, 44 assertions — all passing) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs: add transcript-viewer documentation and sync to bundle/docs - Created docs/claude/skills/transcript-viewer/README.md with usage guide - Synced SKILL.md to amplifier-bundle/ and docs/ (drift prevention) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * fix(quality-audit-cycle): prevent JSON from being interpreted as bash commands in verify-fixes (#2886) * fix(quality-audit-cycle): prevent JSON agent output from being interpreted as bash commands in verify-fixes step The verify-fixes step used python3 -c "..." with double-quoting. 
When the template variables {{validated_findings}} and {{fix_results}} were expanded into $VALIDATED and $FIX_RESULTS via bash variable substitution inside the double-quoted python3 -c string, any double-quotes in the JSON (all JSON keys/strings) terminated the outer bash double-quoted argument prematurely. This caused bash to interpret JSON content (e.g. "json:", "cycle:", "validated:") as shell commands, producing errors like:

/bin/bash: line 17: json: command not found
/bin/bash: line 19: cycle:: command not found

Fix: export the bash variables and use a single-quoted heredoc (<<'PYEOF') so Python reads the JSON from os.environ instead of string interpolation. This completely isolates the JSON content from bash string parsing.

Closes #2872

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* [skip ci] chore: Auto-bump patch version

---------

Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* fix: remove redundant single quotes from quality-audit-cycle bash templates (#2887)

render_shell() already applies shlex.quote() to template variables. The manual single quotes in the YAML ('{{var}}') caused double-quoting that broke bash: JSON output was interpreted as commands.

Removed single quotes from all bash step template variable assignments in the verify-fixes, update-history, and decide-continue steps.
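Both quoting problems above can be reproduced in a few lines. The sketch below uses `shlex.split` as a stand-in for POSIX shell tokenization (an approximation of what bash does, not bash itself) to show why wrapping an already-quoted value in extra single quotes splits it apart, and then passes the same JSON through an environment variable, which is the idea behind the heredoc fix. The variable name VALIDATED follows the commit message; the helper names are hypothetical.

```python
import json
import os
import shlex
import subprocess
import sys


def shell_tokens(assignment: str) -> list[str]:
    """Tokenize a shell assignment the way a POSIX shell would (approximation)."""
    return shlex.split(assignment)


def read_json_via_env(payload: str) -> dict:
    """Hand JSON to a child Python process through the environment only."""
    child_code = (
        "import json, os; "
        "print(json.dumps(json.loads(os.environ['VALIDATED'])))"
    )
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        env={**os.environ, "VALIDATED": payload},
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


payload = json.dumps({"cycle": 1, "validated": ["a b"]})
quoted_once = shlex.quote(payload)   # what a render step would already apply
quoted_twice = f"'{quoted_once}'"    # manual quotes from the YAML on top
```

Because the JSON never appears in the command text in `read_json_via_env`, no quote character inside it can affect how the command is parsed.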
Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: split power_steering_checker.py into modular package with Copilot support (#2845) (#2910) * refactor: split power_steering_checker.py 5063 LOC into 5 modules (issue #2845) - Extract considerations.py: ConsiderationAnalysis, PowerSteeringResult, CheckerResult, PowerSteeringRedirect dataclasses and analysis logic - Extract sdk_calls.py: Claude SDK interaction layer with configurable _timeout constant - Extract progress_tracking.py: progress thresholds as configurable MODULE_PROGRESS_THRESHOLD and OVERALL_PROGRESS_THRESHOLD constants - Extract result_formatting.py: output formatting and display utilities - Extract main_checker.py: PowerSteeringChecker orchestrator, check_session, is_disabled entry points - Add __init__.py with backward-compatible re-exports (all public symbols preserved) - Fix broad except Exception blocks to log at WARNING with exc_info=True - Make hardcoded timeouts (SDK_TIMEOUT=30) and thresholds configurable module-level constants - Add comprehensive unit tests for each module (test_psc_*.py) - Add architecture documentation Closes: part of #2845 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: address security and code quality issues from PR #2872 review Security (blocking): - SEC-1: Add _validate_session_id() guard in _log_violation() before path construction (main_checker.py) — prevents path traversal via crafted session_id like '../../etc/x' - SEC-2: Add _validate_session_id() guard in _write_summary() before path construction (progress_tracking.py) — same class of bug Code quality: - Move `import sys` from inside _write_with_retry() to module level (progress_tracking.py) - Remove redundant `import sys` inside except block in sdk_calls.py; use `_sys` already imported at line 474 - Add exc_info=True to _save_redirect() ERROR log (progress_tracking.py) - Guard 
sys.path.insert() with `if _hook_dir not in sys.path` to prevent duplicate entries on repeated imports (main_checker.py) Requirements deviation: - Add analyze_consideration to __all__ in __init__.py; remove # noqa: F401 All 154 PSC tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: add missing re-exports and sync .claude/ copy for power_steering_checker Outside-in testing found 2 issues with the power_steering_checker refactor: 1. __init__.py was missing 5 symbols: get_shared_runtime_dir, _write_with_retry, MAX_TRANSCRIPT_LINES, CHECKER_TIMEOUT, PARALLEL_TIMEOUT. Test mock.patch() calls targeting these would fail. 2. .claude/tools/ still had the old 5200-line monolith while amplifier-bundle/ had the new package. Replaced monolith with package copy to prevent divergence. Note: considerations.py (2423 lines) still needs further splitting but that's a separate task. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: split considerations.py into 6 focused modules (issue #2845) Split ConsiderationsMixin (2423 lines) into focused sub-modules: - session_detection.py: SessionDetectionMixin (8 methods + constants) - transcript_helpers.py: TranscriptHelpersMixin (6 methods) - checks_workflow.py: ChecksWorkflowMixin (6 methods + patterns) - checks_quality.py: ChecksQualityMixin (9 methods + constants) - checks_docs.py: ChecksDocsMixin (7 methods + constants) - checks_ci_pr.py: ChecksCiPrMixin (7 methods + pattern) considerations.py now retains only dataclasses (CheckerResult, ConsiderationAnalysis, PowerSteeringRedirect, PowerSteeringResult), the _env_int helper, and ConsiderationsMixin as a shell that inherits from all 6 focused mixins with PHASE1_CONSIDERATIONS. Updated __init__.py to re-export all new mixin classes. Synced .claude/ copy with all refactored files. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: update test_power_steering_worktree.py for refactored package structure - Update mock.patch targets from power_steering_checker.get_shared_runtime_dir to power_steering_checker.main_checker.get_shared_runtime_dir (function now lives in main_checker submodule, not top-level namespace) - Fix .disabled file creation paths: current _is_disabled() checks shared_runtime/power-steering/.disabled, not shared_runtime/.disabled (tests were written against an older implementation) All 26 worktree integration tests now pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: align mixin method behavior with original monolith for 6 failing tests Fix 6 failing tests in power_steering_checker refactor (issue #2845): 1. _check_agent_unnecessary_questions (checks_quality.py): Revert to counting question marks in assistant text instead of AskUserQuestion tool calls, matching original behavior. 2. _check_documentation_updates (checks_docs.py): Revert to checking all code file modifications (not just public-facing paths), so any code change without docs update triggers the check. 3. _check_next_steps (checks_workflow.py): Revert to simple keyword matching (next steps, todo, pending, etc.) instead of requiring structured bulleted lists with regex patterns. 4. _check_review_responses (checks_ci_pr.py): Revert to checking user messages for review-related keywords instead of requiring concrete PR review CLI commands. 5. _check_unrelated_changes (checks_quality.py): Revert to simple file count heuristic (>20 files = scope creep) instead of top-level directory counting (which failed for absolute paths outside project root). 6. test_redirect_saved_on_block_decision (main_checker.py): Add deterministic override step (5a) after _analyze_considerations: run heuristics for satisfied considerations and override with False if heuristic detects concrete failure. 
This prevents SDK fail-open from masking real failures like incomplete TODOs. Preserves SDK-first architecture in _check_single_consideration_async. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(power_steering_checker): address 3 confirmed findings from quality audit Security/reliability/dead-code audit of all 12 modules in the refactored power_steering_checker package (PR #2872). Three confirmed findings fixed: 1. Dead code (MEDIUM) — checks_quality.py:408 Redundant `import re as _re` inside _check_interactive_testing method body; `re` is already imported at module level (line 1). Removed the late import and replaced _re.search / _re.IGNORECASE with re.search / re.IGNORECASE. 2. Dead code (MEDIUM) — checks_docs.py:181-185 Unreachable branch in _check_feature_docs_discoverable: edge case 2 had condition `and not new_features` which is always False at that point because edge case 1 already returned True when not new_features. The entire dead block was removed (behaviour unchanged — the early return in edge case 1 already covers the empty-features scenario; with feature definitions present we must not skip discoverability). 3. Silent fallback (MEDIUM) — main_checker.py:153-156 `except OSError: pass` in PowerSteeringChecker.__init__ swallowed runtime-dir creation failures with no log output. Changed to capture the exception and emit a stdlib logger.warning for observability while preserving the fail-open behaviour. All 83 tests pass (1 skipped), 0 failures. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(power_steering_checker): add Copilot transcript format auto-detection and parsing Adds transcript_parser.py with: - detect_transcript_format(): inspects first JSONL line to identify Claude Code vs GitHub Copilot CLI events.jsonl format (flat role-based or event-based) - parse_copilot_transcript(): normalizes Copilot events into the same list[dict] shape checker methods expect (type, message.role, message.content, timestamp, sessionId) - parse_claude_code_transcript(): existing passthrough behavior, no normalization - parse_transcript(): auto-detect + dispatch entry point Updates _load_transcript in main_checker.py to use parse_transcript(), so both Claude Code JSONL and Copilot events.jsonl are processed transparently. Adds 48 tests in test_transcript_parser.py covering format detection, event normalization, oversized-line safety, and _load_transcript integration. Existing Claude Code transcript parsing is unchanged (same raw dicts returned). Closes #2845 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * test(power_steering): add Copilot e2e tests and sync transcript_parser to .claude/ (#2845) Extends the Copilot transcript format support added in the preceding commit by syncing the transcript_parser module and _load_transcript integration into the .claude/ package copy, and adding a comprehensive e2e test suite with a realistic Copilot-format fixture. 
Changes: - Sync .claude/ power_steering_checker package with amplifier-bundle: - Add transcript_parser.py (copied from amplifier-bundle — Copilot format detection and normalization) - Update main_checker.py _load_transcript() to use parse_transcript(): auto-detects Claude Code vs Copilot CLI format, normalizes Copilot events into canonical list[dict] shape, logs format detection for observability - Add tests/fixtures/copilot_events.jsonl (both .claude/ and amplifier-bundle): Realistic 20-line Copilot CLI session: conversation_start, user message, assistant messages with tool_call/tool_result events for file writes, test execution, git commit, final assistant summary, conversation_end - Add tests/test_copilot_e2e_power_steering.py (22 tests): Full end-to-end pipeline coverage not present in test_transcript_parser.py: - Format detection: claude_code, copilot flat, copilot event, empty, fixture - Normalization: user/assistant/tool_call/tool_result, flat format, unchanged Claude Code passthrough - check_session() on raw Copilot JSONL (auto-detect path) - check_session() on pre-normalized Copilot transcript - detect_session_type() on normalized Copilot messages - result.decision in ["approve", "block"], result.reasons is list[str] - Empty session handled gracefully - Edge cases: unknown role skipped, JSON-string arguments decoded, interleaved turns, malformed JSON line skipped (fail-open) All 22 new tests pass; 48 amplifier-bundle transcript_parser tests pass; 30 existing power_steering_checker tests pass (1 skipped, pre-existing). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [skip ci] chore: Auto-bump patch version * fix(transcript_parser): support real Copilot dotted event type format The Copilot transcript parser was broken against real Copilot session data. 
Real Copilot sessions use dotted type names (user.message, assistant.message, session.start) with content nested under a 'data' sub-object, not the fake event-based format the fixtures previously used. Changes: - detect_transcript_format(): detect dotted type names (user.message, assistant.message, session.start etc.) as 'copilot' format - normalize_copilot_event(): handle user.message (data.content) and assistant.message (data.content, data.toolRequests); skip all lifecycle events (session.start, session.model_change, assistant.turn_start, assistant.turn_end, session.shutdown) - Replace copilot_events.jsonl fixture with real data from ~/.copilot/session-state/b1cc7005-4c26-46d7-a9e5-ebc5a882be65/events.jsonl - Add 17 new tests for real Copilot format in test_transcript_parser.py - Sync all fixes to .claude/ copy (parser, fixture, test file) Verified: parse_transcript() correctly returns format='copilot' and 2 normalized messages (user + assistant) from the real events.jsonl. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: unset CLAUDECODE env var before claude_agent_sdk.query() calls to prevent nested session errors When running inside an existing Claude Code session, the CLAUDECODE env var causes claude_agent_sdk.query() subprocess spawning to fail with: "Claude Code cannot be launched inside another Claude Code session." Fix: Add os.environ.pop("CLAUDECODE", None) at module level in all files that use claude_agent_sdk.query(), matching the pattern already used by the multitask orchestrator. Files fixed: - amplifier-bundle/tools/amplihack/hooks/claude_power_steering.py - amplifier-bundle/tools/amplihack/hooks/claude_reflection.py - amplifier-bundle/skills/pm-architect/scripts/triage_pr.py - amplifier-bundle/skills/pm-architect/scripts/generate_roadmap_review.py - amplifier-bundle/skills/pm-architect/scripts/generate_daily_status.py - src/amplihack/launcher/auto_mode.py Also syncs .claude/ and docs/claude/ copies to match amplifier-bundle. 
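The CLAUDECODE fix above can be sketched as follows. This is a minimal illustration, not the repository's code: `child_env_without_claudecode` and `run_child` are hypothetical helpers, and only the `os.environ.pop("CLAUDECODE", None)` call is taken from the commit message.

```python
import os
import subprocess
import sys

# Module-level strip, mirroring the os.environ.pop("CLAUDECODE", None)
# pattern named in the commit message.
os.environ.pop("CLAUDECODE", None)


def child_env_without_claudecode(base=None):
    """Return a copy of an environment with CLAUDECODE removed.

    Hypothetical helper: the name and signature are assumptions.
    """
    env = dict(os.environ if base is None else base)
    env.pop("CLAUDECODE", None)  # no error if the variable is absent
    return env


def run_child(cmd):
    """Run a child process that must not inherit CLAUDECODE (illustrative)."""
    return subprocess.run(
        cmd,
        env=child_env_without_claudecode(),
        capture_output=True,
        text=True,
        check=False,
    )


if __name__ == "__main__":
    probe = [sys.executable, "-c", "import os; print('CLAUDECODE' in os.environ)"]
    print(run_child(probe).stdout.strip())
```

Passing an explicit `env=` to the child keeps the stripping local to the call site, which is useful when the parent process still needs the variable for its own checks.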
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * Update documentation for power_steering_checker refactoring and Copilot CLI support (#2911) ## Changes ### Power-Steering Documentation (docs/features/power-steering/README.md) - Updated Architecture section to reflect modular 12-file package structure - Added changelog entry for v0.10.0 (2026-03-07) with refactoring details - Documented Copilot CLI transcript support - Highlighted 76% LOC reduction (5,063 → 1,217 lines in largest module) - Added module responsibility descriptions and cross-references ### API Reference (docs/reference/power-steering-checker-api.md) - Added refactoring summary to Package Overview section - Documented 191 tests passing (121 existing + 48 parser + 22 Copilot e2e) - Noted CLAUDECODE environment variable fix for nested sessions - Cross-referenced power_steering_checker package README ### Copilot CLI Integration (docs/COPILOT_CLI.md) - Bumped version to 1.1.0 (from 1.0.0) - Added new "Copilot CLI Transcript Support" section to ToC - Documented auto-detection of Claude Code vs Copilot CLI transcript formats - Explained module structure and testing coverage - Highlighted benefits for Copilot CLI users (session completion validation) ## Context Following Diátaxis framework: - **Explanation**: Architecture changes in Power-Steering docs - **Reference**: API updates in reference documentation - **Tutorial**: Usage guidance in Copilot CLI integration docs Based on merged PRs from 2026-03-06 to 2026-03-07: - PR #2910: Major refactoring into modular package - PR #2887: Bash template quoting fix - PR #2886: JSON handling in quality-audit-cycle Co-authored-by: Claude Documentation Agent <claude-agent@amplihack.github.io> Co-authored-by: Claude Sonnet 
4.5 <noreply@anthropic.com> * fix: recipe runner nesting — unset CLAUDECODE, tmux execution, auto-install tmux (#2912) * fix: recipe runner nesting — unset CLAUDECODE, tmux execution, auto-install tmux Three fixes for recipe runner execution inside Claude Code sessions: 1. ClaudeSDKAdapter: Added os.environ.pop("CLAUDECODE", None) before SDK query call. The parent Claude Code sets CLAUDECODE, child sessions refuse to start if present. Both adapters now strip it. 2. dev-orchestrator SKILL.md: Updated execution instructions to use tmux sessions instead of run_in_background. Claude Code's background task manager kills processes after ~10 min (Issue #2909). Recipe workstreams can take hours. Instructions now use: - tmux new-session -d for detached execution - env -u CLAUDECODE for clean env - CLISubprocessAdapter for subprocess isolation 3. Install script: Auto-installs tmux if missing (apt-get, brew, dnf). tmux is now required for recipe runner execution. Closes #2909 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * [skip ci] chore: Auto-bump patch version --------- Co-authored-by: Ubuntu <azureuser@devy.yb0a3bvkdghunmsjr4s3fnfhra.phxx.internal.cloudapp.net> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * feat: power-steering SDK abstraction — auto-select Claude or Copilot SDK (#2917) (#2918) * feat: power-steering SDK abstraction — auto-select Claude or Copilot SDK (#2917) Add power_steering_sdk.py that auto-detects the active launcher (Claude Code or GitHub Copilot CLI) via LauncherDetector and routes LLM queries to the correct SDK. One code path, two backends. 
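A minimal sketch of that routing, assuming the launcher name has already been detected and each SDK is wrapped in a plain callable. The `launcher` and `backends` parameters are hypothetical and exist only to keep the example self-contained; returning an empty string on a backend exception is also an assumption, extending the fail-open behavior this commit describes for the case where no SDK is available.

```python
from typing import Callable, Dict, Optional

# A backend takes (prompt, project_root) and returns the model's text.
Backend = Callable[[str, str], str]


def query_llm(
    prompt: str,
    project_root: str,
    launcher: Optional[str],
    backends: Dict[str, Backend],
) -> str:
    """Route a prompt to the backend registered for the detected launcher.

    Fail-open: return "" when no backend matches (or, by assumption, when
    the backend raises) so callers can fall back to heuristics.
    """
    backend = backends.get(launcher or "")
    if backend is None:
        return ""
    try:
        return backend(prompt, project_root)
    except Exception:
        return ""


if __name__ == "__main__":
    fake_backends = {
        "claude": lambda prompt, root: f"claude:{prompt}",
        "copilot": lambda prompt, root: f"copilot:{prompt}",
    }
    print(query_llm("is the session done?", ".", "copilot", fake_backends))
```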
- New: power_steering_sdk.py with query_llm(prompt, project_root) -> str - Refactored: claude_power_steering.py — replaced 5 identical 12-line SDK call blocks with single-line query_llm() calls (-228 lines) - Auto-detection: uses existing LauncherDetector from adaptive context - Fail-open: returns "" if neither SDK available (heuristic fallback) - Both SDKs support sessions for future optimization Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * [skip ci] chore: Auto-bump patch version * fix: correct Copilot SDK API usage — async methods, proper session lifecycle (#2917) CopilotClient methods are async coroutines, not sync. Fixed _query_copilot to await start/create_session/send_and_wait/stop. Extract response text from event.data.content. Verified against real Copilot SDK (hit version mismatch on server but API calls resolved correctly). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * fix: enforce default-workflow for single-workstream tasks (#2928) * fix: enforce default-workflow for single-workstream tasks in smart-orchestrator The single-workstream execution paths in smart-orchestrator.yaml were using `type: agent` with a builder agent and a text prompt that merely *asked* the agent to follow DEFAULT_WORKFLOW steps 0-22. This provided no enforcement -- the builder agent would skip workflow steps and implement directly. Changed `execute-single-round-1` and `execute-single-fallback-blocked` from `type: agent` to `type: recipe` with `recipe: default-workflow`, matching the multi-workstream path which already uses recipe-based execution via orchestrator.py. 
Parent context (task_description, repo_path) flows through automatically via the runner's context merging in _execute_sub_recipe(). Continuation rounds (execute-round-2, execute-round-3) remain as agent steps since they are incremental work referencing previous round results. Fixes #2927 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * [skip ci] chore: Auto-bump patch version --------- Co-authored-by: Ubuntu <azureuser@devy.yb0a3bvkdghunmsjr4s3fnfhra.phxx.internal.cloudapp.net> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * feat: add skill_invocation power-steering check (from closed #2916) (#2926) * feat: add skill_invocation power-steering check (extracted from #2916) Detects when a user requests a skill via slash command (<command-name> tag) but the agent bypasses it and responds directly without invoking the Skill tool. This was independently valuable work from closed PR #2916 that got dropped during the stale PR cleanup. - checks_workflow.py: Add _check_skill_invocation method - considerations.yaml: Add skill_invocation blocker for DEV/INVESTIGATION/MAINTENANCE - 12 outside-in tests (Claude + Copilot + edge cases) Related to #2914 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * [skip ci] chore: Auto-bump patch version --------- Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * fix: add OPERATIONS session type for PM/planning sessions (#2914) (#2923) * fix: add OPERATIONS session type for PM/planning sessions (#2914) Power-steering incorrectly activated dev checks on Q&A/PM sessions (e.g. /pm-architect) because Read/Grep tool usage triggered INVESTIGATION classification, which applies workflow checks. 
Add OPERATIONS session type detected via PM/planning keywords (prioritize, backlog, roadmap, sprint, triage, etc.) that skips all power-steering considerations — same as SIMPLE sessions. Detection priority: SIMPLE > OPERATIONS > DEVELOPMENT > INVESTIGATION. Closes #2914, Closes #2913 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * [skip ci] chore: Auto-bump patch version * test: add OPERATIONS session type detection tests (#2914) 12 unit tests covering: - PM/planning keyword detection (pm-architect, backlog, roadmap, sprint) - OPERATIONS skips all considerations - Env override AMPLIHACK_SESSION_TYPE=OPERATIONS - Priority: SIMPLE > OPERATIONS > DEVELOPMENT > INVESTIGATION - Code modifications with operations keywords Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: add outside-in e2e tests for OPERATIONS session type (#2914) 8 tests covering both Claude and Copilot sessions: - Claude /pm-architect → OPERATIONS classification - Claude full check() flow → approve (no blocks) - Copilot backlog triage → OPERATIONS classification - Copilot full check() flow → approve (no blocks) - Development sessions still get full checks - Simulates real transcript structure with tool_use blocks Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * fix: add safety valve to stop hook lock mode (#2874) (#2924) * fix: add safety valve to stop hook lock mode to prevent infinite loops (#2874) When lock mode is active and the agent has completed all work, the stop hook blocks every stop attempt indefinitely — creating an infinite loop of 100+ empty block cycles consuming API tokens with zero productive work. 
Add a safety valve: after AMPLIHACK_MAX_LOCK_ITERATIONS (default 50) consecutive lock blocks, auto-approve the stop and remove the lock file. The user is notified via stderr and can re-enable with /amplihack:lock. The threshold is configurable via environment variable for users who need longer autonomous sessions. Closes #2874 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * [skip ci] chore: Auto-bump patch version * test: add safety valve unit tests for stop hook lock mode (#2874) 6 tests covering: - Normal lock block below threshold - Safety valve triggers at threshold (50) - Lock file removal on trigger - Custom threshold via AMPLIHACK_MAX_LOCK_ITERATIONS - Below-threshold with custom value - No lock file approves normally Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: add outside-in e2e tests for stop hook safety valve (#2874) 7 tests covering both Claude and Copilot sessions: - Claude: lock blocks normally, safety valve at default threshold, custom threshold, simulated infinite loop scenario - Copilot: lock blocks, safety valve triggers - No lock mode: unaffected (still approves normally) Simulated infinite loop test reproduces the exact #2874 scenario: 3 rapid stop attempts with threshold=3, verifying block→block→approve. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * fix: enforce review and outside-in testing steps in multitask workstreams (#2925) (#2930) * fix: enforce review and outside-in testing steps in multitask workstreams (#2925) Root causes: 1. smart-orchestrator.yaml: validate-outside-in-testing had a condition requiring 'pull/' in round_1_result. 
For parallel/multitask workstreams, round_1_result is the orchestrator report (no PR URLs), so this step was permanently SKIPPED for all multi-workstream executions. 2. default-workflow.yaml: step-17a-compliance-verification only echoed instructions without checking local_testing_gate — it was a no-op that never blocked review when step-13 (outside-in testing) was skipped. Fixes: - smart-orchestrator.yaml: Remove 'pull/' URL check from condition. Validation now triggers for any Development task with results, letting the reviewer agent determine whether outside-in testing was done. Updated prompt handles both single-workstream (PR URLs visible) and parallel workstream (orchestrator report format) execution modes. - default-workflow.yaml: step-17a now reads {{local_testing_gate}} and exits 1 if empty, hard-blocking the review phase until step-13 is done. Tests: - tests/outside_in/test_multitask_mandatory_review_steps.py (14 tests) * Verifies condition triggers for parallel orchestrator reports * Verifies condition still skips Q&A/Operations tasks * Verifies step-17a exits non-zero when testing gate is empty * Verifies step-17a succeeds when testing gate is populated * Verifies both recipe copies are in sync Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [skip ci] chore: Auto-bump patch version --------- Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * fix: use stdin for large payloads in recipe summarize step (#2921) (#2931) * fix: use stdin for large payloads in recipe summarize step (#2921) Prevent "Argument list too long" error when processing large workstreams by passing data via stdin instead of command-line arguments in the CLI subprocess adapter. 
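The stdin approach can be illustrated with the standard library alone. This is a sketch under assumptions: `run_with_stdin_payload` is a hypothetical name, and the adapter's real command line is not shown in this log.

```python
import subprocess
import sys


def run_with_stdin_payload(cmd, payload, timeout=60):
    """Send `payload` to the child on stdin instead of placing it in argv.

    argv is limited by the operating system (the source of the
    "Argument list too long" error); stdin has no such size limit.
    """
    completed = subprocess.run(
        cmd,
        input=payload,        # written to the child's stdin
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,           # raise CalledProcessError on non-zero exit
    )
    return completed.stdout


if __name__ == "__main__":
    # A payload far larger than typical argv limits.
    big = "x" * 5_000_000
    reader = [sys.executable, "-c", "import sys; print(len(sys.stdin.read()))"]
    print(run_with_stdin_payload(reader, big).strip())
```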
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * [skip ci] chore: Auto-bump patch version --------- Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * fix: remove CLAUDECODE env var detection, centralize stripping (#2883) * fix: remove CLAUDECODE env var detection, centralize env stripping The Claude Code binary sets CLAUDECODE to block nested sessions, but we always want nested sessions to work. This change: - Removes CLAUDECODE-based adapter selection logic from get_adapter() - Creates centralized build_child_env() in adapters/env.py that strips CLAUDECODE from all child processes in one place - Removes `unset CLAUDECODE` from shell scripts (Python adapters handle it) - Updates tests to verify env stripping without depending on detection - Updates documentation to remove CLAUDECODE as a user-facing concern Follow-up to #2845 quality improvements. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * [skip ci] chore: Auto-bump patch version --------- Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * investigate: dual status classifiers — DO-NOT-RECOMMEND unification (#2898) (#2933) * investigate: add recommendation doc and tests for dual status classifiers (#2898) Investigation finding: DO-NOT-RECOMMEND unifying the two classifiers. WorkflowClassifier (pre-action routing, string input) and SessionDetectionMixin (post-hoc enforcement, transcript input) serve fundamentally different purposes with different input types, output taxonomies, and consumers. Adds outside-in regression tests that guard against unbounded keyword drift. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * ci: trigger full CI pipeline * [skip ci] chore: Auto-bump patch version --------- Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * feat: parallel VM polling with ThreadPoolExecutor (#2896) (#2934) * feat: implement parallel VM polling with ThreadPoolExecutor (#2896) Add poll_vm_statuses() to Orchestrator and refresh_pool_statuses() to VMPoolManager. VM status polling previously had no parallel mechanism; this adds concurrent polling via ThreadPoolExecutor to dramatically reduce latency when checking many VMs simultaneously. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * ci: trigger full CI pipeline * [skip ci] chore: Auto-bump patch version --------- Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * fix: make .claude/tools/amplihack/hooks/ canonical, amplifier-bundle a symlink (#2881) (#2935) * fix: make .claude/tools/amplihack/hooks/ canonical, amplifier-bundle a symlink (#2881) Replace amplifier-bundle/tools/amplihack/hooks/ directory with a symlink to .claude/tools/amplihack/hooks/, making .claude/ the single source of truth. 
Changes: - Convert amplifier-bundle/tools/amplihack/hooks/ from a directory to a symlink pointing to ../../../.claude/tools/amplihack/hooks/ - Move 10 files that existed only in amplifier-bundle/hooks/ to .claude/hooks/: dev_intent_router.py, templates/routing_prompt.txt, and 8 test files - Update test_main_branch_protection.py to verify symlink structure instead of checking byte-identity of two separate copies - Update docs/features/main-branch-protection.md to reflect symlink architecture - Add outside-in tests (tests/outside_in/test_hooks_canonical_location.py) verifying the symlink structure from a user's perspective The build system (build_hooks.py) already uses symlinks=True in shutil.copytree, so the symlink is preserved correctly in wheel builds. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [skip ci] chore: Auto-bump patch version * docs: note canonical location in hooks README Add note that .claude/tools/amplihack/hooks/ is the canonical source and amplifier-bundle/tools/amplihack/hooks/ is a symlink to it. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * feat: add allow-list for safe send_input patterns (#2903) (#2936) * feat: add allow-list for safe send_input patterns (#2903) Implements a configurable allow-list mechanism for the send_input action used in gadugi-agentic-test YAML scenarios. Safe patterns (y, n, Enter, quit, exit) are accepted without confirmation; arbitrary values require explicit opt-in via confirm=True (--confirm flag). 
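The allow-list behavior can be sketched as below. The names follow the ones used in this commit, but the signatures, the exact-match rule, and the whitespace trimming are assumptions for illustration, not the module's actual implementation.

```python
# Safe patterns as listed in the commit message.
DEFAULT_SAFE_PATTERNS = frozenset({"y", "n", "Enter", "quit", "exit"})


class UnsafeInputError(ValueError):
    """Raised when a send_input value is neither allow-listed nor confirmed."""


def is_safe_pattern(value, extra_patterns=frozenset()):
    """Assumed rule: exact match after trimming surrounding whitespace."""
    allowed = DEFAULT_SAFE_PATTERNS | frozenset(extra_patterns)
    return value.strip() in allowed


def validate_send_input(value, confirm=False, extra_patterns=frozenset()):
    """Return `value` if it is safe or explicitly confirmed, else raise."""
    if confirm or is_safe_pattern(value, extra_patterns):
        return value
    raise UnsafeInputError(
        f"send_input value {value!r} is not allow-listed; "
        "pass confirm=True to allow it"
    )


if __name__ == "__main__":
    print(validate_send_input("y"))
    try:
        validate_send_input("rm -rf /")
    except UnsafeInputError as exc:
        print(f"blocked: {exc}")
```

Keeping the default set immutable (`frozenset`) means extra patterns can only be added per call, which avoids accidental widening of the allow-list at import time.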
New module: src/amplihack/testing/send_input_allowlist.py - DEFAULT_SAFE_PATTERNS frozenset of common interaction responses - ALLOWLIST_ENV_VAR for loading extra patterns from a JSON file - UnsafeInputError raised on disallowed values - is_safe_pattern() / validate_send_input() / validate_scenario_send_inputs() 47 outside-in tests added in tests/outside_in/test_safe_send_input_allowlist.py Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * ci: trigger full CI pipeline * [skip ci] chore: Auto-bump patch version --------- Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * feat: add --port flag to azlin for bastion tunnel reuse (#2897) (#2937) * feat: add --port flag to azlin for bastion tunnel reuse (#2897) Reduces connection overhead by allowing SSH sessions to reuse an existing bastion tunnel instead of creating a new one per command. 
Changes: - VMOptions.tunnel_port: new optional int field - Executor.__init__ accepts tunnel_port; _azlin_port_args() helper injects --port <N> into all azlin subcommands (cp, connect, ssh) - SessionManager._execute_ssh_command accepts tunnel_port parameter - CLI: --port option added to both 'exec' and 'start' commands Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * ci: trigger full CI pipeline * [skip ci] chore: Auto-bump patch version --------- Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * feat: split fleet CLI - extract _cli_formatters from _cli_session_ops (#2900) (#2938) * feat: split fleet CLI - extract formatters module from session ops (#2900) - Create src/amplihack/fleet/ package with: - _cli_formatters.py: ScoutResult, AdvanceResult dataclasses and format_scout_report/format_advance_report functions (224 LOC) - _cli_session_ops.py: Fleet session lifecycle management, run_scout, run_advance - imports formatters, stays under 400 LOC (366 LOC) - __init__.py: Clean public API re-exports - Add outside-in tests (38 tests, 100% pass) covering: - Session lifecycle (start, stop, list, status) - Scout and advance agent operations - All three output formats (table, json, yaml) - Module separation verification Addresses issue #2900: extract format_scout_report and format_advance_report into separate _cli_formatters.py, keeping session ops focused. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * ci: trigger full CI pipeline * [skip ci] chore: Auto-bump patch version --------- Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * fix: use run-script file for tmux eval pipeline to fix quoting issues (#2922) (#2939) * fix: use run-script file for tmux eval pipeline to fix quoting issues (#2922) The execute_remote_tmux method previously embedded the decoded prompt directly in a tmux send-keys command: tmux send-keys -t SESSION "amplihack ... -p \"$PROMPT\"" C-m When $PROMPT contained double quotes, dollar signs, backticks, or other shell-special characters, the shell inside the tmux pane misinterpreted the command, breaking end-to-end automation. Fix: write a self-contained run script using a heredoc with a single-quoted delimiter ('AMPLIHACK_RUN_EOF'), which prevents the outer shell from expanding $(...) and $VARIABLE. Python f-string substitution inserts literal base64 values before the shell processes the heredoc. The script decodes both the prompt and API key from base64 at execution time inside the tmux session, using properly-quoted "$PROMPT" to pass the value as a single argument regardless of content. Changes: - Replace 4 fragile send-keys lines with a clean run-script approach in execute_remote_tmux() across all three executor.py copies - Add 32 outside-in tests in tests/outside_in/test_eval_pipeline_tmux.py covering special character safety, heredoc quoting, API key encoding, tmux session creation, bash syntax validity, and regression prevention All 19 existing unit tests and 32 new outside-in tests pass. 
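The run-script technique described in the fix above can be sketched as follows. This is an illustration under assumptions: `build_run_script_command`, the `script_path` default, the `API_KEY` variable name, and the `command` parameter are all hypothetical, since the real command line is elided in this log. It also assumes the remote host has a `base64` that accepts `-d`.

```python
import base64
import shlex


def build_run_script_command(prompt, api_key, script_path="/tmp/run_prompt.sh",
                             command="printf '%s\\n'"):
    """Build shell text that writes a run script via a quoted heredoc.

    The prompt and key are embedded only as base64, so no shell-special
    character from the prompt ever appears in the generated text.
    """
    prompt_b64 = base64.b64encode(prompt.encode("utf-8")).decode("ascii")
    key_b64 = base64.b64encode(api_key.encode("utf-8")).decode("ascii")
    path = shlex.quote(script_path)
    return (
        # Single-quoted delimiter: the outer shell expands nothing in the body.
        f"cat > {path} << 'AMPLIHACK_RUN_EOF'\n"
        "#!/usr/bin/env bash\n"
        # Decoding happens when the script runs, inside the tmux session.
        f"PROMPT=$(printf '%s' '{prompt_b64}' | base64 -d)\n"
        f"export API_KEY=$(printf '%s' '{key_b64}' | base64 -d)\n"
        # Double quotes pass the prompt as one argument regardless of content.
        f'{command} "$PROMPT"\n'
        "AMPLIHACK_RUN_EOF\n"
        f"chmod +x {path}\n"
    )


if __name__ == "__main__":
    nasty = 'say "hi" $(whoami) `date` $HOME'
    print(build_run_script_command(nasty, "example-key"))
```

The generated text can then be sent to the remote shell in one piece, after which tmux only needs to execute the script path, so `send-keys` never carries the prompt itself.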
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * ci: trigger full CI pipeline * [skip ci] chore: Auto-bump patch version --------- Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * feat: cache SSH output between discovery and reasoning phases (#2899) (#2942) * feat: cache SSH output between discovery and reasoning phases (#2899) Add TTL-based SSH output cache to SessionManager so repeated capture_output() calls within the TTL window reuse cached results instead of re-running SSH commands. Reduces SSH overhead during the discovery→reasoning transition. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * ci: trigger full CI pipeline --------- Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: Oxidizer recipe — automated Python-to-Rust migration workflow (#2950) * feat: Oxidizer recipe — automated Python-to-Rust migration workflow Adds the oxidizer-workflow recipe, skill definition, and documentation. 
- amplifier-bundle/recipes/oxidizer-workflow.yaml: 65-step recipe with iterative convergence loops, quality audits, and zero-tolerance parity - .claude/skills/oxidizer-workflow/SKILL.md: Skill definition with activation keywords and usage examples - docs/OXIDIZER.md: Full documentation covering all phases, context variables, and the zero-tolerance policy - mkdocs.yml: Navigation entries for the workflow and skill Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * [skip ci] chore: Auto-bump patch version --------- Co-authored-by: Ubuntu <azureuser@devo.xh24nwhiyviedbtbx54dafh01e.dx.internal.cloudapp.net> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * feat: integrate /top5 priority aggregation into pm-architect (#2941) * feat: integrate /top5 priority aggregation into pm-architect Merge top5 priority aggregation directly into pm-architect rather than as a standalone skill. Adds Pattern 5 (Quick Priority View) and Pattern 6 (Daily Standup) to the orchestrator. Files added/modified: - scripts/generate_top5.py: Aggregates priorities across backlog-curator, workstream-coordinator, roadmap-strategist, and work-delegator into a strict Top 5 ranked list (weights: 35/25/25/15) - scripts/tests/test_generate_top5.py: 31 unit tests covering extraction, aggregation, ranking, tiebreaking, and edge cases - scripts/tests/conftest.py: Added pm_dir, sample_backlog_items, and populated_pm fixtures for top5 testing - SKILL.md: Added /top5 trigger, Pattern 5 and 6, updated scripts list Closes milestone 1 of #2932 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: GitHub-native data sourcing for /top5 Rewrites generate_top5.py to query live GitHub data instead of reading static .pm/ YAML files. Sources issues and PRs across multiple GitHub accounts via gh CLI search API. 
- Reads .pm/sources.yaml for account/repo configuration - Fetches open issues and PRs via gh api search/issues - Scores by: label priority, staleness, comment activity, draft status - Falls back to .pm/backlog/ for local overrides when GitHub unavailable - Restores original gh account after multi-account queries - Weights: issues 40%, PRs 30%, roadmap alignment 20%, local 10% Tested live: 116 candidates from rysweet + rysweet_microsoft accounts 29 unit tests passing (mocked gh calls + aggregation logic) Implements Milestone 0 of #2932 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: address review findings in /top5 implementation - Remove dead `env` variable and unused comment in run_gh() - Simplify get_current_gh_account() from fragile stderr parsing to `gh api user --jq` - Fix lstrip("-* ") bug in extract_roadmap_goals() using removeprefix() - Rename shadow variable `l` to `lbl` in list comprehensions for readability - Fix SKILL.md weight documentation to match actual code (40/30/20/10) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * [skip ci] chore: Auto-bump patch version * test: add outside-in agentic test scenarios for /top5 trigger Five gadugi-agentic-test YAML scenarios covering: - Smoke test: valid JSON output with expected keys - GitHub source aggregation: end-to-end with real sources.yaml - Error handling: malformed YAML, missing dirs, empty sources - Local overrides: .pm/backlog + roadmap alignment scoring - Ranking enforcement: top-5 limit, rank fields 1-5, score ordering Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: enriched /top5 output with score breakdown, actions, near-misses, and repo summary The previous output was a flat ranked list with no context for decision-making. Now includes: - Score breakdown per item (label_priority, staleness, activity components) - Suggested action per item ("Merge, close, or rebase", "Fix immediately", etc.) 
- Near-misses (items #6-#10 that just missed the cut) - Per-repo summary (issue/PR counts, high-priority counts) - Per-account summary (total work across repos) - Full metadata preserved (labels, dates, days_stale, draft status, comments) 40 tests passing (up from 29), covering new features: - suggest_action logic for all source types - near_misses return from aggregate_and_rank - build_repo_summary grouping and counting Refs #2932 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: sync skill mirrors and remove ephemeral .pm/ state files - Sync all new pm-architect files to amplifier-bundle/skills/ and docs/claude/skills/ (fixes "Check skill/agent drift" CI failure) - Remove .pm/ ephemeral state files (backlog, workstreams, delegations, config) that Repo Guardian correctly flagged as point-in-time state - Add .pm/ to .gitignore to prevent future accidental commits Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * [skip ci] chore: Auto-bump patch version
--------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Ubuntu <azureuser@devy.yb0a3bvkdghunmsjr4s3fnfhra.phxx.internal.cloudapp.net> Co-authored-by: Ubuntu <azureuser@devo.xh24nwhiyviedbtbx54dafh01e.dx.internal.cloudapp.net> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat: Rust recipe runner integration with engine selection (#2951) Replaces Python recipe runner with Rust implementation from rysweet/amplihack-recipe-runner.
- Engine selection via RECIPE_RUNNER_ENGINE env var (rust/python/auto-detect)
- Auto-installs via cargo on first use (ensure_rust_recipe_runner)
- Nested session depth enforcement (AMPLIHACK_MAX_DEPTH)
- Non-interactive footer for autonomous agent execution
- Configurable timeouts via env vars
- Context value redaction in logs
- 34 tests covering all paths
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix: classic mode launcher hangs indefinitely due to multi-line -p arg (#2958)
* fix: put classic launcher -p argument on single line to prevent hang (#2946)
The classic mode launcher wrote a multi-line -p argument that the shell split at newlines, causing amplihack claude to wait on stdin indefinitely. Collapse the prompt onto a single line so the entire -p value is passed as one argument. Add 3 regression tests verifying no newlines in the -p argument.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* [skip ci] chore: Auto-bump patch version
---------
Co-authored-by: Ubuntu <azureuser@deva.ftnmxvem3frujn3lepas045p5c.xx.internal.cloudapp.net>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* fix: ensure Rust recipe runner on startup, add cargo to prerequisites (#2957)
* fix: ensure Rust recipe runner on startup, add cargo to prerequisites
- Add ensure_rust_recipe_runner() call to copilot launcher startup
- Add Rust/cargo to Prerequisites in README.md and PREREQUISITES.md
- Add cargo install instructions to all platform sections (macOS, Ubuntu, Fedora, Arch)
- Add cargo --version to verification commands
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* feat: add Rust recipe runner check to all startup paths
Move ensure_rust_recipe_runner() from copilot-only to a shared _ensure_rust_recipe_runner() function called from all 6 launcher entry points: launch, claude, RustyClawd, copilot, codex, amplifier. Previously only copilot and install had the check, leaving 4 paths uncovered.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* refactor: consolidate launcher startup into _common_launcher_startup()
Extract nesting detection, framework staging, Rust recipe runner check, SDK dep check, and power-steering prompt into a single idempotent function called from all 5 launcher entry points.
Before: launch_command() had 7 init steps; copilot/codex/amplifier only had staging + rust runner. Now all paths get identical init.
Also fixes 6 pre-existing test failures in test_cli_claude_command_guard by mocking _common_launcher_startup (staging sys.exit(1) was leaking through the test harness).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* test: add 19 tests for _common_launcher_startup()
Covers:
- Idempotency guard (double-call safe for RustyClawd → launch_command)
- subprocess_safe skip
- Nesting detection and auto-staging
- Startup steps order (staged → rust → sdk → power-steering)
- Non-fatal failure handling for SDK deps and power-steering
- _ensure_rust_recipe_runner output (success, warning, import error)
- All 6 launcher paths call _common_launcher_startup
Outside-in verified: each launcher command (launch, claude, RustyClawd, copilot, codex, amplifier) shows 'Rust recipe runner available' in real subprocess output.
Co-authored-by: Copilot <223556219+Copilo…
Rust Recipe Runner Integration
Integrates the standalone Rust recipe runner into amplihack with automatic engine selection and startup dependency management.
What's included
- src/amplihack/recipes/rust_runner.py: Binary wrapper with find_rust_binary(), ensure_rust_recipe_runner(), and run_recipe_via_rust(). Raises RustRunnerNotFoundError when the Rust engine is explicitly selected but the binary is missing (no silent fallback).
- src/amplihack/recipes/__init__.py: Engine selection via the RECIPE_RUNNER_ENGINE env var:
  - rust → Rust binary only (fails if not installed)
  - python → Python runner only
- src/amplihack/install.py: Step 6.5 automatically installs recipe-runner-rs during amplihack install if cargo is available
- tests/recipes/test_rust_runner.py: 26 tests covering binary discovery, execution, JSON parsing, engine selection, and the ensure flow
- docs/recipes/README.md: Documents the engine selection table and auto-install
- .gitignore: Excludes the recipe runner checkout directory
Design Principles
- When RECIPE_RUNNER_ENGINE=rust is set, the binary must exist or execution fails with a clear error
- ensure_rust_recipe_runner() tries cargo install --git but doesn't block startup if it fails
Testing
uv run pytest tests/recipes/test_rust_runner.py -v # 26 tests
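For reviewers, the engine selection rules described above can be sketched roughly as follows. This is an illustration only, not the code in src/amplihack/recipes/__init__.py: select_engine() and its two parameters are hypothetical names, the RustRunnerNotFoundError class here is a local stand-in, and the behaviour when the variable is unset or unrecognized (prefer Rust if the binary is found, otherwise use Python) is an assumption based on the "auto-detect" wording in the commit message.

```python
import os
import shutil


class RustRunnerNotFoundError(RuntimeError):
    """Local stand-in: the Rust engine was requested but its binary is missing."""


def select_engine(env, binary_path):
    """Return "rust" or "python" for the given environment mapping.

    Hypothetical helper. `binary_path` is the located recipe-runner-rs
    binary, or None when it could not be found.
    """
    choice = env.get("RECIPE_RUNNER_ENGINE", "").strip().lower()
    if choice == "python":
        return "python"
    if choice == "rust":
        # Explicit selection: fail loudly instead of silently falling back.
        if not binary_path:
            raise RustRunnerNotFoundError(
                "RECIPE_RUNNER_ENGINE=rust but recipe-runner-rs was not found"
            )
        return "rust"
    # Assumption: any other value (including unset) means auto-detect.
    return "rust" if binary_path else "python"


if __name__ == "__main__":
    # shutil.which returns None when the binary is not on PATH.
    print(select_engine(os.environ, shutil.which("recipe-runner-rs")))
```

Passing the environment and the binary path in as arguments keeps the decision free of side effects, so each branch can be checked without a real binary on PATH.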